Various forms of regularization in learning tasks strive for different notions of simplicity. This paper presents a spectral regularization technique, which attaches a unique inductive bias to sequence modeling based on an intuitive concept of simplicity defined in the Chomsky hierarchy. From fundamental connections between Hankel matrices and regular grammars, we propose to use the trace norm of the Hankel matrix, the tightest convex relaxation of its rank, as the spectral regularizer. To cope with the fact that the Hankel matrix is bi-infinite, we propose an unbiased stochastic estimator for its trace norm. Ultimately, we demonstrate experimental results on Tomita grammars, which exhibit the potential benefits of spectral regularization and validate the proposed stochastic estimator.
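As a rough illustration of the quantity being relaxed, the sketch below builds a finite sub-block of the Hankel matrix for two Tomita grammars and computes its rank exactly. It makes several assumptions not stated in the abstract: the grammar definitions are the commonly cited ones (Tomita 1 as the language 1*, Tomita 5 as strings with an even number of 0s and of 1s), the basis is all strings of length at most 2, and every function name is hypothetical. The trace norm that the abstract proposes as a regularizer would be the sum of singular values of such a block; it is not computed here, and the stochastic estimator is not reproduced.

```python
from fractions import Fraction
from itertools import product


def all_strings(max_len, alphabet="01"):
    """All strings over `alphabet` with length 0..max_len, shortest first."""
    return ["".join(p) for n in range(max_len + 1)
            for p in product(alphabet, repeat=n)]


def tomita1(s):
    """Assumed definition of Tomita 1: strings made only of 1s (1*)."""
    return "0" not in s


def tomita5(s):
    """Assumed definition of Tomita 5: even number of 0s and of 1s."""
    return s.count("0") % 2 == 0 and s.count("1") % 2 == 0


def hankel_block(f, prefixes, suffixes):
    """Finite Hankel block: entry (u, v) is the indicator f(u + v)."""
    return [[Fraction(int(f(u + v))) for v in suffixes] for u in prefixes]


def matrix_rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [row[:] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


basis = all_strings(2)  # "", "0", "1", "00", "01", "10", "11"
rank1 = matrix_rank(hankel_block(tomita1, basis, basis))
rank5 = matrix_rank(hankel_block(tomita5, basis, basis))
print(rank1, rank5)
```

On this small basis the block for the simpler grammar has lower rank than the block for the parity grammar, which matches the notion of simplicity the regularizer is meant to favour.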
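Weighted finite automata (WFA) have been widely applied in many fields. One of the classic problems for WFA is estimating probability distributions over sequences of discrete symbols. Although WFA have been extended to handle continuous input data, namely continuous WFA (CWFA), it remains unclear how to approximate density functions over sequences of continuous random variables using WFA-based models, owing both to the limited expressiveness of the model and to the tractability of approximating density functions via CWFA. In this paper, we propose a nonlinear extension of the CWFA model to improve its expressiveness, which we refer to as nonlinear continuous WFA (NCWFA). We then leverage the so-called RNADE method, a well-known neural-network-based density estimator, and propose the RNADE-NCWFA model. The RNADE-NCWFA model computes a density function by design. We show that this model is strictly more expressive than the Gaussian HMM model, which CWFA cannot approximate. Empirically, we conduct a synthetic experiment using data generated by a Gaussian HMM. We focus on evaluating the model's ability to estimate densities for sequences of varying lengths (longer than those in the training data). We observe that our model performs best among the compared baseline methods.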
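In this paper, we present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks, which encompass a set of optimization techniques for high-order tensors used in quantum physics and numerical analysis. We first present an intrinsic relation between WFA and the tensor train decomposition, a particular form of tensor network. This relation allows us to exhibit a novel low-rank structure of the Hankel matrix of a function computed by a WFA, and to design an efficient spectral learning algorithm that exploits this structure to scale to very large Hankel matrices. We then unravel a fundamental connection between WFA and second-order recurrent neural networks (2-RNN): for sequences of discrete symbols, WFA and 2-RNN with linear activation functions are expressively equivalent. Combining this equivalence result with the classical spectral learning algorithm for weighted automata, we introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors. This algorithm relies on estimating low-rank sub-blocks of the Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performance of the proposed learning algorithm is assessed in a simulation study on both synthetic and real-world data.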
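The equivalence between WFA and linear 2-RNN on discrete sequences can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not the authors' algorithm: the two-state automaton (which counts occurrences of the symbol `a`), the one-hot encoding, and all function names are hypothetical choices made for this example. The second-order tensor is filled so that `tensor[i][s][j]` equals entry `(i, j)` of the WFA transition matrix for symbol `s`.

```python
from itertools import product


def wfa_value(alpha, transitions, omega, word):
    """WFA value: alpha^T A[x1] ... A[xk] omega."""
    h = list(alpha)
    for sym in word:
        A = transitions[sym]
        h = [sum(h[i] * A[i][j] for i in range(len(h)))
             for j in range(len(A[0]))]
    return sum(hi * wi for hi, wi in zip(h, omega))


def linear_2rnn_value(h0, tensor, omega, inputs):
    """Linear 2-RNN: h_t[j] = sum_{i,s} h_{t-1}[i] * x_t[s] * tensor[i][s][j]."""
    h = list(h0)
    n = len(h)
    for x in inputs:
        h = [sum(h[i] * x[s] * tensor[i][s][j]
                 for i in range(n) for s in range(len(x)))
             for j in range(n)]
    return sum(hi * wi for hi, wi in zip(h, omega))


# Hypothetical 2-state WFA over {a, b} whose value is the number of a's.
alpha = [1, 0]
omega = [0, 1]
transitions = {
    "a": [[1, 1], [0, 1]],
    "b": [[1, 0], [0, 1]],
}

# One-hot encoding of symbols and the matching second-order tensor.
symbols = ["a", "b"]
one_hot = {"a": [1, 0], "b": [0, 1]}
tensor = [[transitions[s][i] for s in symbols] for i in range(2)]

word = "abaa"
value_wfa = wfa_value(alpha, transitions, omega, word)
value_rnn = linear_2rnn_value(alpha, tensor, omega, [one_hot[c] for c in word])
print(value_wfa, value_rnn)
```

Because `linear_2rnn_value` accepts arbitrary real-valued input vectors, the same function also evaluates the model on continuous inputs, which is the setting the learning algorithm in the abstract targets.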
In this paper, we present a framework for learning quadruped navigation by integrating central pattern generators (CPGs), i.e. systems of coupled oscillators, into the deep reinforcement learning (DRL) framework. Through both exteroceptive and proprioceptive sensing, the agent learns to modulate the intrinsic oscillator setpoints (amplitude and frequency) and coordinate rhythmic behavior among different oscillators to track velocity commands while avoiding collisions with the environment. We compare different neural network architectures (i.e. memory-free and memory-enabled) which learn implicit interoscillator couplings, as well as varying the strength of the explicit coupling weights in the oscillator dynamics equations. We train our policies in simulation and perform a sim-to-real transfer to the Unitree Go1 quadruped, where we observe robust navigation in a variety of scenarios. Our results show that both memory-enabled policy representations and explicit interoscillator couplings are beneficial for a successful sim-to-real transfer for navigation tasks. Video results can be found at https://youtu.be/O_LX1oLZOe0.
Deep spiking neural networks (SNNs) offer the promise of low-power artificial intelligence. However, training deep SNNs from scratch or converting deep artificial neural networks to SNNs without loss of performance has been a challenge. Here we propose an exact mapping from a network with Rectified Linear Units (ReLUs) to an SNN that fires exactly one spike per neuron. For our constructive proof, we assume that an arbitrary multi-layer ReLU network with or without convolutional layers, batch normalization and max pooling layers was trained to high performance on some training set. Furthermore, we assume that we have access to a representative example of input data used during training and to the exact parameters (weights and biases) of the trained ReLU network. The mapping from deep ReLU networks to SNNs causes zero percent drop in accuracy on CIFAR10, CIFAR100 and the ImageNet-like data sets Places365 and PASS. More generally our work shows that an arbitrary deep ReLU network can be replaced by an energy-efficient single-spike neural network without any loss of performance.
Recently, extensive studies on photonic reinforcement learning to accelerate the process of calculation by exploiting the physical nature of light have been conducted. Previous studies utilized quantum interference of photons to achieve collective decision-making without choice conflicts when solving the competitive multi-armed bandit problem, a fundamental example of reinforcement learning. However, the bandit problem deals with a static environment where the agent's action does not influence the reward probabilities. This study aims to extend the conventional approach to a more general multi-agent reinforcement learning targeting the grid world problem. Unlike the conventional approach, the proposed scheme deals with a dynamic environment where the reward changes because of agents' actions. A successful photonic reinforcement learning scheme requires both a photonic system that contributes to the quality of learning and a suitable algorithm. This study proposes a novel learning algorithm, discontinuous bandit Q-learning, in view of a potential photonic implementation. Here, state-action pairs in the environment are regarded as slot machines in the context of the bandit problem and an updated amount of Q-value is regarded as the reward of the bandit problem. We perform numerical simulations to validate the effectiveness of the bandit algorithm. In addition, we propose a multi-agent architecture in which agents are indirectly connected through quantum interference of light and quantum principles ensure the conflict-free property of state-action pair selections among agents. We demonstrate that multi-agent reinforcement learning can be accelerated owing to conflict avoidance among multiple agents.
Fingerprints are key tools in climate change detection and attribution (D&A) that are used to determine whether changes in observations are different from internal climate variability (detection), and whether observed changes can be assigned to specific external drivers (attribution). We propose a direct D&A approach based on supervised learning to extract fingerprints that lead to robust predictions under relevant interventions on exogenous variables, i.e., climate drivers other than the target. We employ anchor regression, a distributionally-robust statistical learning method inspired by causal inference that extrapolates well to perturbed data under the interventions considered. The residuals from the prediction achieve either uncorrelatedness or mean independence with the exogenous variables, thus guaranteeing robustness. We define D&A as a unified hypothesis testing framework that relies on the same statistical model but uses different targets and test statistics. In the experiments, we first show that the CO2 forcing can be robustly predicted from temperature spatial patterns under strong interventions on the solar forcing. Second, we illustrate attribution to the greenhouse gases and aerosols while protecting against interventions on the aerosols and CO2 forcing, respectively. Our study shows that incorporating robustness constraints against relevant interventions may significantly benefit detection and attribution of climate change.
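For readers unfamiliar with anchor regression, the sketch below shows its usual formulation in the simplest possible setting: one predictor, one anchor variable, no intercept. It assumes the standard objective, which penalizes the part of the residual explained by the anchor with a weight `gamma`, and solves it by the equivalent data transformation followed by least squares. The data values and function names are made up for illustration and have nothing to do with the climate experiments in the abstract.

```python
import math


def project_onto(anchor, v):
    """Projection of v onto the span of the anchor vector."""
    scale = sum(a * x for a, x in zip(anchor, v)) / sum(a * a for a in anchor)
    return [scale * a for a in anchor]


def anchor_regression_1d(x, y, anchor, gamma):
    """Scalar anchor regression coefficient (assumed standard formulation).

    Minimizes |(I - P)(y - x*b)|^2 + gamma * |P(y - x*b)|^2, where P projects
    onto the anchor. This equals ordinary least squares after replacing each
    vector v by v - (1 - sqrt(gamma)) * P v.
    """
    def transform(v):
        p = project_onto(anchor, v)
        return [vi - (1.0 - math.sqrt(gamma)) * pi for vi, pi in zip(v, p)]

    xt, yt = transform(x), transform(y)
    return sum(a * b for a, b in zip(xt, yt)) / sum(a * a for a in xt)


# Made-up illustrative data (not from the paper).
anchor = [-2.0, -1.0, 0.0, 1.0, 2.0]
x = [-1.5, -1.2, 0.3, 0.9, 1.5]
y = [-1.0, -2.0, 0.5, 1.5, 1.0]

b_ols = anchor_regression_1d(x, y, anchor, gamma=1.0)      # plain least squares
b_robust = anchor_regression_1d(x, y, anchor, gamma=1e8)   # near-zero anchor correlation
print(b_ols, b_robust)
```

With `gamma = 1` the transformation is the identity and the estimate is ordinary least squares; as `gamma` grows, the residual is pushed toward being uncorrelated with the anchor, which is the robustness property the abstract refers to.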
We discuss pattern languages for closed pattern mining and learning of interval data and distributional data. We first introduce pattern languages relying on pairs of intersection-based constraints or pairs of inclusion-based constraints, or both, applied to intervals. We discuss the encoding of such interval patterns as itemsets, which allows the use of closed itemset mining and formal concept analysis programs. We experiment with these languages on clustering and supervised learning tasks. Then we show how to extend the approach to address distributional data.
Long-distance agreement, which is evidence for syntactic structure, is increasingly used to assess the syntactic generalization of Neural Language Models. Much work has shown that transformers are capable of high accuracy in varied agreement tasks, but the mechanisms by which the models accomplish this behavior are still not well understood. To better understand transformers' internal workings, this work contrasts how they handle two superficially similar but theoretically distinct agreement phenomena: subject-verb and object-past participle agreement in French. Using probing and counterfactual analysis methods, our experiments show that i) the agreement task suffers from several confounders which partially call into question the conclusions drawn so far and ii) transformers handle subject-verb and object-past participle agreements in a way that is consistent with their modeling in theoretical linguistics.
Predicting the physical interaction of proteins is a cornerstone problem in computational biology. New classes of learning-based algorithms are actively being developed, and are typically trained end-to-end on protein complex structures extracted from the Protein Data Bank. These training datasets tend to be large and difficult to use for prototyping and, unlike image or natural language datasets, they are not easily interpretable by non-experts. We present Dock2D-IP and Dock2D-IF, two "toy" datasets that can be used to select algorithms predicting protein-protein interactions, or any other type of molecular interactions. Using two-dimensional shapes as input, each example from Dock2D-IP ("interaction pose") describes the interaction pose of two shapes known to interact and each example from Dock2D-IF ("interaction fact") describes whether two shapes form a stable complex or not. We propose a number of baseline solutions to the problem and show that the same underlying energy function can be learned either by solving the interaction pose task (formulated as an energy-minimization "docking" problem) or the fact-of-interaction task (formulated as a binding free energy estimation problem).